A Stochastic Composite Augmented Lagrangian Method for Reinforcement Learning
Authors
Abstract
In this paper, we consider the linear programming (LP) formulation for deep reinforcement learning. The number of constraints depends on the size of the state and action spaces, which makes the problem intractable in large or continuous environments. The general augmented Lagrangian method suffers from the double-sampling obstacle in solving the linear program. Motivated by the updates of the multipliers, we overcome the obstacles in minimizing the augmented Lagrangian function by replacing the conditional expectations with the multipliers. Therefore, a deep parameterized augmented Lagrangian method is proposed. The replacement provides a promising breakthrough to integrate the two steps of the augmented Lagrangian method into a single quadratic penalty problem. A theoretical analysis shows that the solutions generated from a sequence of constrained optimization problems converge to the optimal solution of the linear program if the error is controlled properly. A theoretical analysis of the quadratic penalty algorithm without using target networks under the neural tangent kernel setting shows that the residual can be arbitrarily small if the parameter in the network is chosen suitably. Preliminary experiments illustrate that our method is competitive with other state-of-the-art algorithms.
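For readers unfamiliar with the "two steps" of the classical augmented Lagrangian method mentioned in the abstract (a primal minimization followed by a multiplier update), the following is a minimal sketch on a toy problem. It is not the paper's algorithm: the objective x1^2 + x2^2, the constraint x1 + x2 = 1, the function name `augmented_lagrangian`, and all parameter values are assumptions chosen purely for illustration, with no neural parameterization or sampling involved.

```python
def augmented_lagrangian(rho=10.0, outer_iters=10, inner_iters=200):
    """Toy augmented Lagrangian loop (illustrative assumption, not the paper's method).

    Problem: minimize x1^2 + x2^2 subject to x1 + x2 - 1 = 0.
    Augmented Lagrangian: f(x) + lam * c(x) + (rho / 2) * c(x)^2.
    """
    x = [0.0, 0.0]   # primal variables
    lam = 0.0        # Lagrange multiplier for the equality constraint
    # The Hessian of the augmented Lagrangian has largest eigenvalue 2 + 2*rho,
    # so this step size keeps plain gradient descent stable.
    step = 1.0 / (2.0 + 2.0 * rho)
    for _ in range(outer_iters):
        # Step 1: approximately minimize the augmented Lagrangian in x.
        for _ in range(inner_iters):
            c = x[0] + x[1] - 1.0
            grad = [2.0 * xi + lam + rho * c for xi in x]
            x = [xi - step * gi for xi, gi in zip(x, grad)]
        # Step 2: update the multiplier using the constraint residual.
        lam += rho * (x[0] + x[1] - 1.0)
    return x, lam


x_opt, lam_opt = augmented_lagrangian()
print(x_opt, lam_opt)
```

On this toy problem the iterates approach x = (0.5, 0.5) with multiplier close to -1. The abstract describes merging these two steps into a single quadratic penalty problem; the sketch only shows the classical two-step loop that serves as the starting point.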
Similar resources
The proximal augmented Lagrangian method for nonsmooth composite optimization
We study a class of optimization problems in which the objective function is given by the sum of a differentiable but possibly nonconvex component and a nondifferentiable convex regularization term. We introduce an auxiliary variable to separate the objective function components and utilize the Moreau envelope of the regularization term to derive the proximal augmented Lagrangian – a continuous...
Augmented Lagrangian Filter Method
We introduce a filter mechanism to force convergence for augmented Lagrangian methods for nonlinear programming. In contrast to traditional augmented Lagrangian methods, our approach does not require the use of forcing sequences that drive the first-order error to zero. Instead, we employ a filter to drive the optimality measures to zero. Our algorithm is flexible in the sense that it allows fo...
Solving Environmental/Economic Power Dispatch Problem by a Trust Region Based Augmented Lagrangian Method
This paper proposes a Trust-Region Based Augmented Lagrangian Method (TRALM) to solve a combined Environmental and Economic Power Dispatch (EEPD) problem. The EEPD problem is a multi-objective problem with competing and non-commensurable objectives. The TRALM produces a set of non-dominated Pareto optimal solutions for the problem. Fuzzy set theory is employed to extract a compromise non-dominated sol...
An augmented Lagrangian method for distributed optimization
We propose a novel distributed method for convex optimization problems with a certain separability structure. The method is based on the augmented Lagrangian framework. We analyze its convergence and provide an application to two network models, as well as to a two-stage stochastic optimization problem. The proposed method compares favorably to two augmented Lagrangian decomposition methods kno...
PENNON: A Generalized Augmented Lagrangian Method for Semidefinite Programming
This article describes a generalization of the PBM method by Ben-Tal and Zibulevsky to convex semidefinite programming problems. The algorithm used is a generalized version of the Augmented Lagrangian method. We present details of this algorithm as implemented in a new code PENNON. The code can also solve second-order conic programming (SOCP) problems, as well as problems with a mixture of SDP,...
Journal
Journal title: SIAM Journal on Optimization
Year: 2023
ISSN: 1095-7189, 1052-6234
DOI: https://doi.org/10.1137/21m1421726